Abstract

Background: Deciphering gene regulatory networks by in silico approaches is a crucial step in the study of the
molecular perturbations that occur in diseases. The development of regulatory maps is a tedious process requiring
the comprehensive integration of diverse evidence scattered over biological databases. Thus, the research
community would greatly benefit from having a unified database storing known and predicted molecular
interactions. Furthermore, given the intrinsic complexity of the data, the development of new tools offering
integrated and meaningful visualizations of molecular interactions is necessary to help users draw new
hypotheses without being overwhelmed by the density of the resulting graph.

Results: We extend the previously developed TranscriptomeBrowser database with a set of tables containing
1,594,978 human and mouse molecular interactions. The database includes: (i) predicted regulatory interactions
(computed by scanning vertebrate alignments with a set of 1,213 position weight matrices), (ii) potential regulatory
interactions inferred from systematic analysis of ChIP-seq experiments, (iii) regulatory interactions curated from the
literature, (iv) predicted post-transcriptional regulation by micro-RNA, (v) protein kinase-substrate interactions and
(vi) physical protein-protein interactions. In order to easily retrieve and efficiently analyze these interactions, we
developed InteractomeBrowser, a graph-based knowledge browser that comes as a plug-in for TranscriptomeBrowser.
The first objective of InteractomeBrowser is to provide a user-friendly tool to get new insight into any
gene list by providing a context-specific display of putative regulatory and physical interactions. To achieve this,
InteractomeBrowser relies on a "cell compartments-based layout" that makes use of a subset of the Gene Ontology
to map gene products onto relevant cell compartments. This layout is particularly powerful for visual integration of
heterogeneous biological information and is a productive way to generate new hypotheses. The second
objective of InteractomeBrowser is to fill the gap between interaction databases and dynamic modeling. It is thus
compatible with the network analysis software Cytoscape and with the Gene Interaction Network simulation
software (GINsim). We provide examples illustrating the benefits of this visualization tool for large gene set analysis
related to thymocyte differentiation.

Conclusions: The InteractomeBrowser plugin is a powerful tool to get quick access to a knowledge database that 
includes both predicted and validated molecular interactions. InteractomeBrowser is available through the 
TranscriptomeBrowser framework and can be found at: http://tagc.univ-mrs.fr/tbrowser/. Our database is updated 
on a regular basis. 
Background

In the last decade, the advent of high-throughput technologies led to the emergence of the systems biology era
and prompted the research community to systematically define the expression levels of mRNAs and micro-RNAs
(miRNAs) across thousands of cell types and tissues under physiological and pathological conditions [1]. Now, one
of the crucial issues is to define the biological mechanisms that drive gene expression, with the ultimate goal
of reverse-engineering gene regulatory networks (GRN) as a whole in order to predict the system outcome
under molecular perturbations.

One current limit for biologists interested in mining regulatory information, or for bioinformaticians interested in
creating regulatory maps for modeling, is that this information is scattered over the Internet in various formats,
making it difficult to handle. Thus one needs to create a unified database that lists known and predicted
molecular interactions. This information can be obtained from different sources: (i) from the literature, (ii)
from large-scale experimental methods that allow genome-wide profiling of transcription factor (TF) binding
sites on DNA or (iii) from DNA sequence analysis, by searching 3'UTR regions for miRNA-specific motifs or by
scanning gene promoters with transcription factor-specific position weight matrices (PWMs). In the latter case, the
use of comparative genomics is known to greatly improve predictions of functional TF binding sites by limiting the
number of false positives (though increasing the false negative rate) [2,3]. Another limit of GRN analysis is the
intrinsic complexity of the data. In this regard, several graph-based tools have been developed to draw a global
picture of the putative interactions taking place in the biological context of interest (for a review, see
reference [4]). In these, genes or proteins appear as nodes in a graph, and functional relations
(physical/regulatory interactions) are represented as edges connecting the corresponding entities. The topology
of the resulting network can later be analyzed using advanced tools such as Cytoscape [5]. However, as data
integration is a challenge that requires mapping various types of evidence onto a set of stable gene ids, most
applications are oriented toward a single data type (mostly regulatory or physical interactions, see Table 1 for an
overview) [6-10]. Moreover, another challenge is the development of graph-based tools producing clear, meaningful
and integrated visualizations from which users can draw new hypotheses without being overwhelmed by the density
of the presented graphic information. In this regard, the Cytoscape plug-in "Cerebral" proposes an intuitive
visualization method through a "cell compartment-based layout" that shows interacting proteins on a layout
resembling "traditional" signalling pathway/system diagrams [11].

Here, we sought to create a compendium of predicted 
and validated molecular interactions in human and 
mouse. First, we used a large collection of PWMs 
obtained from TRANSFAC (n = 523), JASPAR (n = 303) 



Table 1 A comparison of web tools dedicated to molecular interactions. 

Tools compared: MIR@NT@N, STRING (d), MotifMap (e), GeneMANIA, APID (f), InnateDB, InteractomeBrowser

Database content
Physical protein-protein interactions: - + + + + + +
Computationally predicted TF targets (a): + - + - +
Experimentally observed TF targets (b): - +
Predicted miRNA targets: + - - - +
Regulatory interactions from literature: - + - - +
Biological pathways: - + - + - -
Inferred functional interactions (c): - + - + - -

Batch query: + + - + - - +

Built-in graph visualizer
Add/remove/hide interactors and interactions: + +
Movable nodes: - + ND + + + +
Compartment-based layout: - - - + +

The table provides an overview of the types of molecular interactions and of the functionalities offered by representative web tools previously published.
Information was obtained from the latest articles describing the servers. The presence or absence of the corresponding features is denoted by + or - respectively.
a Refers to bioinformatic prediction of TFBSs using PWMs.
b Refers to results from large-scale experimental methods that profile the binding of TFs to DNA at the genome-wide level (e.g., ChIP-seq, ChIP-chip, ...).
c Refers to computational methods that aggregate various types of information (e.g., expression, genomic distance, conservation) to infer functional interactions.
d Search Tool for the Retrieval of Interacting Genes/Proteins
e The MotifMap visualizer was not available during our tests. Information related to the visualizer was obtained from the documentation.
f Agile Protein Interaction DataAnalyzer
and UNIPROBE (n = 387) to search, in gene promoter regions, for candidate transcription factor binding sites
(TFBSs) conserved over human, mouse, rat and dog genomes [12-14]. Overall, our analysis of these PWMs,
corresponding to 347 human and 475 mouse transcription factors (TFs), provides a systematic overview of gene
regulation in human and mouse. Data generated in this study were next integrated with a large set of molecular
interactions from various sources including (i) potential protein/DNA interactions derived from ChIP-seq
experiments (ChIP-X database), (ii) curated regulatory interactions obtained from the literature (OregAnno,
LymphTF-DB), (iii) predicted miRNA/target interactions (TargetScan), (iv) protein kinase-substrate interactions
derived from multiple online sources (KEA) and (v) physical protein-protein interactions obtained from HPRD,
Reactome and various databases of the IMEx consortium [15-30]. Information related to these interactions was
stored as MySQL tables that were integrated in the back-end database of TranscriptomeBrowser, our previously
published microarray datamining software [31]. Finally, we developed InteractomeBrowser (IBrowser) as a plugin
for TranscriptomeBrowser. IBrowser was developed using the prefuse Java library and can be used to translate any
gene list into a meaningful graph. The distinctive feature of the IBrowser plugin is a new "cell
compartments-based layout" that makes use of a subset of the Gene Ontology to map gene products onto relevant
cell compartments. This layout is particularly powerful for visual integration of heterogeneous biological
information. Moreover, IBrowser is integrated into the TranscriptomeBrowser suite, which allows easy
communication with other tools, for instance to retrieve lists of genes that are frequently coexpressed in given
conditions, thus creating context-specific views of the interactome and regulome.

IBrowser is intended both for biologists and bioinformaticians. On the one hand, it is a graph-based knowledge
browser intended to provide new insight into any user-defined gene list. On the other hand, it is also
intended to fill the gap between heterogeneous genomic data and gene regulatory network analysis. In this
regard, graphs produced inside IBrowser may be exported into Cytoscape and GINsim, a dynamic modeling
software [32]. In the following sections we provide several examples illustrating the benefits of this
visualization tool for large gene set analysis.

Implementation 

We first used phylogenetic footprinting to predict regulatory elements in the human and mouse genomes. A
dataset of 1,213 PWMs corresponding to mouse or human transcription factors was obtained from various
sources (TRANSFAC 10.2, JASPAR 2010, UNIPROBE). The multiz28way (with hg18 as a reference) and the
multiz30way (with mm9 as a reference) cross-species multiple alignments were obtained from UCSC [33]. We
retained for analysis the alignments flanking the transcription start site of any RefSeq transcript on both
sides (-3000, +3000) and devoid of coding sequences. Sequences were scored following the commonly used
formula [34]:

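$$
\mathrm{SCORE}_{p,c} = \sum_{w=0}^{W-1} \log_2 \left( \frac{P(\text{seeing } S_{p+w} \text{ at position } w \mid \text{PWM})}{P(\text{seeing } S_{p+w} \text{ at position } w \mid \text{Background model})} \right)
$$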

where $\mathrm{SCORE}_{p,c}$ represents the PWM score for a PWM of length $W$ in the DNA sequence of a species $c$
between positions $p$ and $p+W-1$, and $S_{p+w}$ represents the nucleotide observed at position $p+w$. The
probability of observing each nucleotide under the background distribution was assumed to be 0.25. For each
PWM $m$, a score threshold $t_m$ with p-value below 5 × 10^-5 was computed using matrix-distrib from RSAT,
ensuring high stringency of sequence scoring [35]. A sequence in the reference genome was considered as a
putative TFBS if its score for PWM $m$ at position $p$ in the alignment was found above $t_m$ in human,
mouse, rat and dog. Each PWM was then linked to its corresponding transcription factors and putative targets.
Information was stored in a MySQL relational database.
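
The scoring and conservation criteria described above can be sketched in a few lines of Python. This is a minimal illustration and not the pipeline used in this study: the toy PWM, the toy alignment, the threshold value and the function names are assumptions made for the example, and the sequences are assumed to be gap-free and of equal length.

```python
import math

BACKGROUND = 0.25  # uniform background probability per nucleotide, as in the text


def pwm_score(pwm, seq, p):
    """Log-odds score of the window seq[p : p + W] under a PWM.

    pwm is a list of W dicts mapping 'A', 'C', 'G', 'T' to probabilities
    (all strictly positive in this sketch, so the logarithm is defined).
    """
    total = 0.0
    for w, column in enumerate(pwm):
        total += math.log2(column[seq[p + w]] / BACKGROUND)
    return total


def conserved_hits(pwm, aligned, threshold):
    """Positions where every species in `aligned` scores above `threshold`.

    aligned maps a species name to its aligned, gap-free sequence.
    """
    width = len(pwm)
    length = min(len(s) for s in aligned.values())
    hits = []
    for p in range(length - width + 1):
        if all(pwm_score(pwm, s, p) > threshold for s in aligned.values()):
            hits.append(p)
    return hits


# Toy 3-column PWM favouring "GGC" (illustrative probabilities only).
toy_pwm = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]
toy_alignment = {
    "human": "ATGGCA",
    "mouse": "ATGGCT",
    "rat": "TTGGCA",
    "dog": "ATGACA",  # the motif is disrupted in this species
}
```

In this toy alignment the motif at position 2 is kept when only human, mouse and rat are considered, but is rejected once dog is included, which is how the conservation requirement trades false positives for false negatives.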

We also integrated data obtained from popular databases. Protein/DNA interactions (n = 174,168) derived from
various genome-wide analyses (e.g., ChIP-on-chip, ChIP-seq and ChIP-PET) and encompassing interactions
corresponding to 38 human TFs and 55 mouse TFs were obtained from the ChIP-X database. TFBS predictions were
obtained from the present work (see below) and from the TFBSConserved UCSC track (n = 367,829 and
n = 686,936, respectively). A set of regulatory interactions curated from the literature was obtained
from LymphTF-DB (392 directed interactions) and OregAnno (1,991 interactions). Protein-protein interaction
datasets were obtained from HPRD (n = 78,325), Reactome (n = 166,001) and IMEx (n = 110,578). Protein
kinase-substrate relationships were retrieved from KEA (n = 14,084). Finally, miRNA/target relationships were
obtained from TargetScan database predictions (n = 260,068). For all datasets, all identifiers were mapped
onto Entrez Gene ids. This compendium of molecular interactions is available as flat files at:
ftp://tagc.univ-mrs.fr/public/TranscriptomeBrowser/DB_Tables/.
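
As an illustration of the identifier mapping step, the sketch below re-keys interactions from heterogeneous sources onto a single identifier space. It is a hypothetical example: the function name, the triple layout and the identifiers in the mapping table are placeholders, not the actual schema of our tables.

```python
def to_entrez(interactions, id_map):
    """Re-key (source_id, target_id, database) triples onto Entrez Gene ids.

    id_map maps any source identifier to an Entrez Gene id. Interactions with
    an unmapped partner are returned separately rather than silently dropped.
    """
    mapped, unmapped = [], []
    for source, target, database in interactions:
        if source in id_map and target in id_map:
            mapped.append((id_map[source], id_map[target], database))
        else:
            unmapped.append((source, target, database))
    return mapped, unmapped


# Placeholder identifiers, for illustration only.
example_map = {"SYMBOL_A": 1001, "PROTEIN_B": 1002}
example_rows = [
    ("SYMBOL_A", "PROTEIN_B", "source_1"),
    ("SYMBOL_A", "UNKNOWN_C", "source_2"),
]
mapped_rows, unmapped_rows = to_entrez(example_rows, example_map)
```

Keeping the unmapped interactions aside makes it possible to check how much of each source is lost during integration.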

InteractomeBrowser was developed using the Prefuse 
Java library which was modified according to our needs. 
InteractomeBrowser requires Java 1.6. 

Results and discussion 

TFBS predictions using comparative genomics 

Although previous works have demonstrated the power of comparative genomics in defining novel regulatory
motifs in human and mouse, few of them integrate the PWMs recently computed from protein binding
microarray (PBM) experiments. Overall, restricting our analysis to promoter regions and using a set of 1,213
PWMs, we predicted TFBSs at 141,305 position-specific motifs of the mouse genome and 164,171 of the human
genome. The median number of hits for any PWM was 117 in mouse (mean, 169; range, 3-2,317) and 122 in human
(mean, 192; range, 6-2,678). The PWMs with the highest number of hits correspond to the Sp1 transcription
factor (M00931, M00933, M00196) in both species (additional file 1, Figure S1). Sp1 binds GC-rich elements
(consensus, GGGGCGGGGC) that are found in the promoter regions of a large number of genes [36]. As promoter
regions are known to contain CpG islands, we checked whether our approach could overestimate the number
of targets for TFs with GC-rich PWMs. As shown in Figure S1, this effect was essentially restricted to Sp1
and, to a lesser extent, to the Maz-related PWM (consensus, RGGGAGGG). As expected, PWMs with high
information content were generally associated with fewer motifs (Figure S1, point size).

Genes with highly conserved promoter regions mostly 
encode transcription factors 

We next estimated the number of predicted regulators for each gene by computing the number of non-redundant
PWMs associated with each gene. The number of PWMs that have a significant match in gene promoter
regions ranges from 1 to 318 (median, 8; mean, 13.37) in mouse and from 1 to 353 in human (median, 7;
mean, 13.17). Genes in the top 1% considering the number of regulators (e.g., Lmo3, Foxp2, Bcl11a) were, as
expected, invariably associated with highly conserved promoter regions. Moreover, functional annotation
indicates that a very large proportion of these genes were transcription factors and genes related to
development. Indeed, in mouse, enrichment analysis of the gene list (112 genes) using Fisher's exact test
(with Benjamini and Hochberg correction) indicated a very strong enrichment for genes related to the terms
"Transcription factor" (PANTHER term; q-value, 1.3 × 10^-27; 52 genes out of 95 annotated), "pattern
specification process" (GO biological process; q-value, 2.8 × 10^-13; 19 genes out of 78 annotated) or
"neuron differentiation" (GO biological process; q-value, 1.48 × 10^-9; 18 genes out of 78 annotated). Very
concordant results were also observed for human (a summary of functional enrichment analysis using the
ClueGO Cytoscape plugin is provided in additional files 2 and 3, Figures S2 and S3) [37]. These results
are in agreement with the work of Bejerano and collaborators, who showed that ultraconserved elements of
the human genome are most often found in genes involved in the regulation of transcription and
development [38]. As a consequence, our phylogenetic footprinting analysis predicts a higher number of motifs
in the promoter regions of these genes. Although TFBS conservation in mammals has been previously analyzed
in several papers, none of them, to our knowledge, reported this observation, which may introduce a bias in
the analysis. However, these ultraconserved regions may also be reminiscent of the HOT (high-occupancy
target) regions identified using ChIP-seq analysis in Caenorhabditis elegans and Drosophila [39,40]. Indeed,
HOT regions have been shown to be significantly associated with "essential genes" (i.e., having an RNAi
phenotype of 100% larval arrest, embryonic lethality, or sterility) and genes related to growth,
reproduction, and larval and embryonic development. However, we cannot rule out that these ultraconserved
regions may also be related to mechanisms other than regulation by site-specific TFs.
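
The q-values quoted in this section come from a Benjamini and Hochberg correction of Fisher's exact test p-values. As a reminder of what this correction does, here is a minimal sketch of the standard procedure; it is not the code used for the analysis, and the function name and example p-values are illustrative assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Illustrative p-values, not taken from the study.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller p-value never receives a larger q-value than a bigger one.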

Biological relevance of the TFBS predictions 

One criterion to assess the reliability of our predictions is based on the hypothesis that the overall
functional properties of the predicted targets can be used to infer the biological processes in which TFs
are involved. To test this hypothesis, we used annotation terms obtained from the GO (biological process),
KEGG, PANTHER, PFAM, SMART, PROSITE, and WIKIPATHWAYS databases and performed systematic annotation of all
predicted target sets in the mouse [41]. For each term/PWM pair we computed the Fisher's exact test
p-value p. Each cell of a matrix with terms (n = 3,905) as rows and PWMs (n = 1,103) as columns was filled
with a score defined as -log(p). We then searched for biclusters inside this matrix using the "binary
inclusion maximal" algorithm (BiMax) [42]. Given the amount of information produced by this analysis, only
some meaningful results are presented; they are summarized in Figure 1. Sites for PWMs related to the ETS
(M00746, M00971, M00771, M00339, MA0136, M00658, M00678), STAT, IRF and RUNX (M00722) transcription factor
families, known to contribute to pathogen responses, were significantly over-represented in genes annotated
as "immune system process" and "lymphocyte activation" (Figure 1A). Sites for PWMs related to the Rel/NF-κB
pathway were significantly associated with targets related to "induction of apoptosis", "Toll-like receptor
signaling pathway" and, as expected, to "NF-kappaB cascade" (Figure 1B). More subtle biclusters related to
the immune system were also found. As an example, RBPJK-specific PWMs (M01112, M01111) were significantly
associated with the term "Notch signaling pathway". Although RBPJK is already known to be crucial in the
NOTCH signaling pathway, PWMs related to TCF3 (also known as E2A and E47) and AP-4 were also found in the
same bicluster
Figure 1 Functional enrichment analysis of predicted targets. Annotation terms obtained from various annotation databases were used to
perform systematic annotation of all predicted target sets in the mouse. For each term/PWM pair we computed the Fisher's exact test p-value
p. Each cell of a matrix with terms as rows and PWMs as columns was filled with a score defined as -log(p). (A-I) Representative biclusters found with
BiMax are presented.



(Figure 1C). This observation is very consistent with the known role of these TFs in early B-cell
differentiation, a developmental step for which the Notch pathway is decisive [43,44]. As expected, a
bicluster containing almost all E2F-related PWMs was also found. Finally, several biclusters related to
"Muscle contraction", "Phosphorus metabolic processes", "Synaptic transmission", "Protein catabolic
processes" and "Pre-mRNA processing" were also observed and are presented in Figure 1E-I. Altogether, these
results highlight the biological relevance of the TFBS predictions and provide a systematic overview of
putative regulatory interactions in human and mouse. These predictions have been termed "TBMC"
(TranscriptomeBrowser Motif Conservation) and are available through the InteractomeBrowser plugin or as a
bed file (see additional files 4 and 5).
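
The term/PWM score matrix used for the bicluster search can be sketched as follows. This is a minimal illustration under stated assumptions: the test is taken to be one-sided (over-representation), the logarithm is taken in base 10 because the base is not specified above, and the function names and toy gene sets are hypothetical.

```python
import math


def fisher_enrichment_p(k, n_targets, n_term, n_total):
    """One-sided Fisher's exact test p-value for over-representation.

    k          genes that are both predicted targets and annotated with the term
    n_targets  number of predicted targets of the PWM
    n_term     number of genes annotated with the term
    n_total    number of genes in the background
    """
    upper = min(n_targets, n_term)
    denom = math.comb(n_total, n_targets)
    tail = sum(
        math.comb(n_term, i) * math.comb(n_total - n_term, n_targets - i)
        for i in range(k, upper + 1)
    )
    return tail / denom


def score_matrix(term_genes, pwm_targets, background):
    """Map (term, PWM) to -log10(p); terms act as rows and PWMs as columns."""
    matrix = {}
    for term, annotated in term_genes.items():
        for pwm, targets in pwm_targets.items():
            p = fisher_enrichment_p(
                len(annotated & targets), len(targets), len(annotated), len(background)
            )
            matrix[(term, pwm)] = -math.log10(p)
    return matrix


# Toy background of ten genes, for illustration only.
background = {f"g{i}" for i in range(1, 11)}
term_genes = {"term_1": {"g1", "g2", "g3"}}
pwm_targets = {"pwm_a": {"g1", "g2", "g3"}, "pwm_b": {"g8", "g9"}}
scores = score_matrix(term_genes, pwm_targets, background)
```

In this toy case the pair (term_1, pwm_a) receives a high score because all three targets carry the term, whereas (term_1, pwm_b) scores zero. A biclustering algorithm such as BiMax would then operate on a binarized version of this matrix.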

InteractomeBrowser: graph-based knowledge browser 

The InteractomeBrowser application can be used to connect to our database in order to identify and analyze
molecular interactions (see additional file 6 for a video tutorial). Available molecular interactions are
derived from various sources: our predictions (TBMC) and numerous databases including ChIP-X, LymphTF-DB,
OregAnno, HPRD, IMEx, Reactome, TargetScan and KEA. However, InteractomeBrowser may also accept
additional interaction datasets that users can provide through a tabulated flat file.

InteractomeBrowser relies on a mixed graph that contains both directed and undirected edges, depicting
various types of interactions ranging from protein complex formation to transcriptional regulation. Thus
nodes represent both genes and gene products.
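
To make the notion of a mixed graph concrete, the following sketch shows one possible in-memory representation. It is not the prefuse-based data model of InteractomeBrowser: the class names, the field names and the placeholder genes "GeneA" and "GeneB" are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str               # e.g. "transcriptional regulation"
    directed: bool
    sign: str = "unknown"   # "positive", "negative" or "unknown"


@dataclass
class MixedGraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, edge):
        self.nodes.update((edge.source, edge.target))
        self.edges.append(edge)

    def neighbours(self, node):
        """Nodes reachable in one step; undirected edges work both ways."""
        found = set()
        for e in self.edges:
            if e.source == node:
                found.add(e.target)
            elif e.target == node and not e.directed:
                found.add(e.source)
        return found


graph = MixedGraph()
# Directed, signed regulatory edge.
graph.add_edge(Edge("Notch1", "Hes1", "transcriptional regulation", True, "positive"))
# Undirected physical interaction between two placeholder gene products.
graph.add_edge(Edge("GeneA", "GeneB", "physical interaction", False))
```

Storing the direction and the sign on the edge rather than on the node lets a single node carry both regulatory and physical interactions.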

InteractomeBrowser uses a subset of terms of the Cellular Component ontology (additional file 7, Figure S4)
to map nodes onto a schematic and hierarchical view of cell compartments (users may choose to disable this
option). As a consequence, each gene product may be represented by several instances (e.g., one in the
nucleus and one in the cytosol).
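
The creation of several instances per gene product can be sketched as below. This is an illustrative assumption of how such a mapping might look: the four terms in `LAYOUT_TERMS`, the catch-all "unknown" zone, the function name and the placeholder annotations are hypothetical and do not reproduce the actual ontology subset shown in Figure S4.

```python
# Hypothetical subset of Cellular Component terms kept for the layout.
LAYOUT_TERMS = {"nucleus", "cytosol", "plasma membrane", "extracellular region"}


def node_instances(gene, annotations, use_compartments=True):
    """Return the (gene, compartment) instances to draw for a gene product.

    annotations maps a gene symbol to the set of its Cellular Component terms.
    Genes without any term in the subset are sent to a catch-all zone.
    """
    if not use_compartments:
        return [(gene, None)]
    kept = sorted(annotations.get(gene, set()) & LAYOUT_TERMS)
    if not kept:
        return [(gene, "unknown")]
    return [(gene, term) for term in kept]


# Placeholder annotations, for illustration only.
annotations = {"GeneA": {"nucleus", "cytosol", "mitochondrial matrix"}}
instances = node_instances("GeneA", annotations)
```

Here "GeneA" yields two instances, one in the cytosol and one in the nucleus, while the annotation that falls outside the subset is ignored.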

Node placement is controlled by a force-directed layout: nodes repel each other,
they are attracted to their respective compartments, and
edges act like springs (the force-directed placement
Figure 2 The InteractomeBrowser plugin. (A) A global and zoom-in view of the InteractomeBrowser cell compartment-based layout. The zoom-in
view shows some sub-cellular compartments together with nodes corresponding to gene products. Note that the node corresponding to Esr1
appears in green, indicating that regulatory information is available for this gene. (B) Positive interactions (i.e., activations) appear as green edges
with normal arrowheads (here Notch1 is the source). (C) Negative interactions (i.e., repressions) appear as red edges with T-shaped arrowheads
(here Mirn17 is the source). (D) Ambiguous interactions (whose repressive or activating status is unknown) appear as violet arrows with dot
arrowheads (here with Mycn as source).



layout can be switched off or on at any moment through the "Display" menu). Once a graph has been drawn, one
can easily add or delete nodes. InteractomeBrowser provides several filters that are intended to focus on the
most interesting part of the network. Users can filter out orphan nodes and empty compartments. An option
called "Hide intercompartmental edges" allows users to remove several unlikely edges of the network, notably
those involving physical interactions between distant compartments (e.g., an instance of gene A in the
nucleus and an instance of gene B in the extracellular region). When the mouse pointer hovers over a node or
an edge, the corresponding information is provided in the "Infos" tab on the left side of the application.
Right-clicking on a node opens a context menu, allowing users to (i) open the NCBI web page for this gene,
(ii) add regulatory interactions involving this gene and other genes of the network, (iii) move the node to
another compartment and (iv) connect to the UCSC genome browser. The action menu provides other tools to
expand the network: (i) add all the interactors of the selected genes or (ii) add common interactors of
selected genes.
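
The three forces that drive node placement (mutual repulsion, attraction towards the compartment, and spring-like edges) can be illustrated with a simple update step. This is a generic sketch and not the modified prefuse implementation used by InteractomeBrowser: the function name, the force constants and the starting coordinates are arbitrary assumptions.

```python
import math


def layout_step(pos, edges, compartment_of, centers,
                repulsion=1.0, spring=0.1, rest=1.0, pull=0.05, dt=0.1):
    """Advance node positions by one step of a simple force-directed scheme.

    pos             node -> (x, y)
    edges           list of (node_a, node_b) pairs acting as springs
    compartment_of  node -> compartment name
    centers         compartment name -> (x, y) centre of its zone
    """
    force = {n: [0.0, 0.0] for n in pos}
    nodes = list(pos)

    def push(node, fx, fy):
        force[node][0] += fx
        force[node][1] += fy

    # Every pair of nodes repels, more strongly at short range.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            f = repulsion / dist ** 2
            push(a, f * dx / dist, f * dy / dist)
            push(b, -f * dx / dist, -f * dy / dist)

    # Each edge behaves like a spring with preferred length `rest`.
    for a, b in edges:
        dx = pos[b][0] - pos[a][0]
        dy = pos[b][1] - pos[a][1]
        dist = math.hypot(dx, dy) or 1e-9
        f = spring * (dist - rest)
        push(a, f * dx / dist, f * dy / dist)
        push(b, -f * dx / dist, -f * dy / dist)

    # Each node is pulled towards the centre of its compartment.
    for n in nodes:
        cx, cy = centers[compartment_of[n]]
        push(n, pull * (cx - pos[n][0]), pull * (cy - pos[n][1]))

    return {n: (pos[n][0] + dt * force[n][0], pos[n][1] + dt * force[n][1])
            for n in nodes}


# Two nuclear nodes joined by one edge; coordinates are arbitrary.
positions = {"Notch1": (0.0, 0.0), "Hes1": (0.5, 0.0)}
for _ in range(100):
    positions = layout_step(
        positions, [("Notch1", "Hes1")],
        {"Notch1": "nucleus", "Hes1": "nucleus"}, {"nucleus": (0.0, 0.0)},
    )
```

Repeating the step lets the three forces balance out, so that connected nodes settle near each other while remaining inside the zone of their compartment.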

IBrowser can be used with any user-defined gene list, for example genes of interest in a particular
experiment. Additionally, the integration of this tool into the TranscriptomeBrowser suite facilitates the
analysis of lists corresponding to pre-processed clusters of co-expressed genes stored in the database.

The next part of the Results and discussion section demonstrates the use of InteractomeBrowser for
retrieving molecular interactions in the context of thymocyte differentiation analysis.

Case study: early T-cell development in mouse 

The development of mature T cells from lymphoid progenitor cells involves a series of cell fate choices that
direct differentiation. In the context of the Immunological Genome Project (ImmGen), M.W. Painter et al. used
rigorously standardized conditions to analyze expression levels of protein-coding genes in almost all defined
T-cell populations of the mouse [45]. Using SAM analysis (FDR 15%), we selected a set of 281 genes repressed
during the transition from the thymic DN3 stage to the DN4 stage. Careful analysis indicated that this gene
set was highly enriched in genes previously shown to be crucially involved during the first step of thymocyte
development. This includes cell surface markers such as Il2ra/Cd25 and Il7r, together with several
transcriptional regulators, including Notch1, Smarca4/Brg1, Dtx1/Deltex1, and Hes1/Hry. More recently,
Neilson et al. identified specific miRNAs enriched at distinct stages of thymocyte development by deep
sequencing [46]. The authors showed that transcripts of the mir17 family are up-regulated at the DN4 stage
and thus could be involved in the repression of DN3-specific messenger RNAs during the DN3 to DN4
transition. We thus combined one member of the mir17 family, Mirn17/Mir17, with the mRNA gene list mentioned
above. This gene list was provided as input to InteractomeBrowser. Figure 2A shows node placement according
to cellular compartment. As shown in Figure 2A and 2B, this layout makes it easy to focus directly on genes
of interest. Indeed, the nucleus subnetwork contains several regulators (e.g., Runx1, Notch1, Hes1 and Xbp1),
some of them colored in green, indicating that regulatory interactions are available for the transcription
factor in our database. Figure 2B shows that several genes (Dtx1, Hes1, Il7r and Bcl2) have been previously
shown to be under the positive control of Notch1 (this curated information is derived from LymphTF-DB).
According to TargetScan predictions, Mirn17/Mir17 does not seem to target any component of the Notch pathway.
In contrast, it is predicted to affect the expression of several transcription regulators including Mycn,
Runx1, Smad7 and the H3K27 methyltransferase Ezh1 (by default, miRNAs are considered as having a negative
effect on mRNAs and thus edges appear as T-shaped arrows). Moreover, it may also control key components of
the cell cycle machinery: Ccnd2 and Cdkn1a. Figure 2D shows the information available from the ChIP-X
database regarding Mycn. This information is derived from a ChIP-seq experiment performed on mouse embryonic
stem cells by Chen et al. [47]. Note that according to these results, Mycn could target several
transcription factors and thus play a key role during the DN3 to DN4 transition. However, in this cellular
context such results should be interpreted with caution since no large-scale analysis of MYCN targets in DN3
thymocytes has been reported so far. Among the potential targets of Mycn, Notch1 is one master switch of the
early to late thymocyte developmental transition. Thus, one could hypothesize that Mirn17/Mir17 may
indirectly affect Notch1 by negatively regulating Mycn. Although these hypotheses rely on predictions and on
the assumption that Mycn binding to the Notch1 promoter is effective in DN3 thymocytes, this example clearly
underlines the potential of this software in helping researchers to draw new hypotheses using data
integration.
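
The reasoning behind this indirect hypothesis can be made explicit by composing edge signs along a path. The sketch below is only an illustration of that reasoning, not a feature of InteractomeBrowser; the function name and the sign vocabulary are assumptions.

```python
SIGN_PRODUCT = {
    ("positive", "positive"): "positive",
    ("positive", "negative"): "negative",
    ("negative", "positive"): "negative",
    ("negative", "negative"): "positive",
}


def path_sign(signs):
    """Net sign of a chain of regulatory edges.

    Any edge of unknown sign makes the net effect unknown.
    """
    net = "positive"
    for s in signs:
        net = SIGN_PRODUCT.get((net, s), "unknown")
    return net


# Mirn17 represses Mycn (default miRNA effect); the Mycn to Notch1 edge is ambiguous.
mirn17_to_notch1 = path_sign(["negative", "unknown"])
```

Because the effect of Mycn on Notch1 is ambiguous in the ChIP-seq data, the net effect of Mirn17/Mir17 on Notch1 remains undetermined, which is why the hypothesis is stated cautiously.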

Conclusions 

InteractomeBrowser and its underlying approach can be compared to the Cerebral (Cell Region-Based Rendering
And Layout) plugin of Cytoscape, which also combines molecular interactions with a cell compartment-based
layout [11].

However, there are qualitative differences in the design of Cerebral and InteractomeBrowser, which make the
latter an interesting alternative for exploring networks.

On the one hand, Cerebral uses a layered representation of the cell to create a "pathway-like" view of the
network of interacting proteins. This layout thus provides a linear organisation of the network. On the other
hand, the layout of InteractomeBrowser is based on a schematic view of the entire cell and displays the
hierarchical structure of the underlying Gene Ontology subset as nested zones. First, this helps to visually
separate different parts of the network corresponding to different cellular localisations, as in Cerebral.
But this is a more generic visualisation method, in the sense that it does not restrict the visual message to
an 'input-intermediates-output' mechanism such as in linear pathway diagrams. As a consequence, it is suited
to a more general study of various types of networks. Moreover, since visual zones correspond to Gene
Ontology terms, this layout handles different levels of accuracy in the localisation of proteins: for
instance a precisely annotated protein might be placed in the zone corresponding to "endoplasmic reticulum",
while a less well-annotated one can be placed in the more generic, higher-level zone "intracellular".

In Cerebral, each gene product is represented by one instance whose cell compartment may be defined by the
user. In contrast, InteractomeBrowser displays, by default, several instances of a given gene product that
may be placed in several cell compartments according to the information provided by the GO Cellular Component
ontology. Although this may lead to a more complex graph, it provides a more exhaustive presentation of
current knowledge and may draw the attention of users to unexpected locations of gene products in the cell.
The user may choose to delete some of these instances, hence selecting a posteriori the most representative
one.

The main benefit of InteractomeBrowser resides in its direct interaction with the database described in this
report. Indeed, it provides a ready-to-use web-based service that requires only a few manipulations to
retrieve a network of interactions (see video tutorial provided as additional file). Notably, in addition to
physical interactions, it offers unified access to miRNA targets and results from ChIP-seq experiments
derived from CHEA.

Presently, the data sources associated with the InteractomeBrowser plug-in are restricted to human and
mouse. Indeed, one of the main objectives of InteractomeBrowser is to help users in creating regulatory maps
to study human gene regulatory networks in physiological and pathological conditions. Mouse is a natural
choice as an additional organism supported by our database, as it is a widely used model of human
physiopathology. However, we are already planning to add new organisms in the near future.

As more and more experimentally validated interactions become available, we hope that this tool will prove
very useful for researchers.

Availability and requirements 

InteractomeBrowser comes as a plugin for TranscriptomeBrowser and is available at:
http://tagc.univ-mrs.fr/tbrowser/. Our database is updated on a regular basis.
See additional files for a video tutorial.

♦ Project name: InteractomeBrowser 

♦ Project home page: http://tagc.univ-mrs.fr/tbrowser/ 

♦ Operating system(s): Platform independent (Java) 

♦ Programming language: Java 

♦ Other requirements: Java > 1.6.X 

♦ License: no license required 

♦ Any restrictions to use by non-academics: none 

List of abbreviations used

PWM: position weight matrix; GRN: gene regulatory network; GO: Gene Ontology; miRNA: micro-RNA; TF:
transcription factor; TFBS: transcription factor binding site; TBMC: TranscriptomeBrowser Motif Conservation.

Additional material 



genes. Genes in the top 1% regards to the number of regulators were 
used as input for the ClueGO plugin. 

Additional file 4: "TFBS predictions in the mouse genome". A bed
file containing TFBS predictions in the mouse genome. Fields contain the following information:
chrom - The name of the chromosome;
chromStart - The starting position of the feature in the chromosome;
chromEnd - The ending position of the feature in the chromosome;
name - PWM identifier and representative names; score - A score for the
PWM hit; strand - Defines the strand, either '+' or '-'; gene id - The gene
id of the target gene; geneSymbol - The gene symbol of the target gene.

Additional file 5: "TFBS predictions in the human genome". A bed
file containing TFBS predictions in the human genome. Fields contain the following information:
chrom - The name of the chromosome;
chromStart - The starting position of the feature in the chromosome;
chromEnd - The ending position of the feature in the chromosome;
name - PWM identifier and representative names; score - A score for the
PWM hit; strand - Defines the strand, either '+' or '-'; gene id - The gene
id of the target gene; geneSymbol - The gene symbol of the target gene.
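
For readers who wish to process the bed files of additional files 4 and 5, a minimal parser for the eight fields listed above could look as follows. This is a sketch that assumes tab-separated fields in the listed order; the class name, the function name and the values in the example line are illustrative and are not taken from the files.

```python
from typing import NamedTuple


class TfbsHit(NamedTuple):
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: float
    strand: str
    gene_id: str
    gene_symbol: str


def parse_tfbs_line(line):
    """Parse one tab-separated line holding the eight fields described above."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    chrom, start, end, name, score, strand, gene_id, symbol = fields
    if strand not in {"+", "-"}:
        raise ValueError(f"unexpected strand: {strand!r}")
    return TfbsHit(chrom, int(start), int(end), name, float(score),
                   strand, gene_id, symbol)


# Made-up line, for illustration only.
hit = parse_tfbs_line("chr1\t1000\t1010\tM00931\t12.5\t+\t12345\tGeneA\n")
```

Validating the field count and the strand early makes malformed lines fail loudly instead of producing shifted columns.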

Additional file 6: "InteractomeBrowser functionalities". Contains a 
web link to a screencast showing basic use of InteractomeBrowser 
plugin. 

Additional file 7: "Subset of Gene Ontology used for the cell 
compartment-based layout". Hierarchical structure of the subset of 
Gene Ontology used in InteractomeBrowser for the cell compartment- 
based layout. Colors highlight the main compartments. 